Bayesian Melding

Background for the Approach

The Bayesian melding approach was proposed by Poole et al. (Poole and Raftery 2000).

This approach enables us to account for both uncertainty from inputs and outputs of a deterministic model. The initial motivation for the approach was to study the population dynamics of whales in the presence of substantial uncertainty around model inputs for population growth (Poole and Raftery 2000). However, the framework provided by Poole et al. can applied in any circumstance where we have uncertainty around some quantities \(\theta\) and \(\phi\) where there is a deterministic function \(M:\theta \to\phi\). Due the utility of Bayesian melding in various contexts, since this deterministic model \(M\) could take on a wide range of forms, the approach has since been applied in various fields, including urban simulations (Ševčíková, Raftery, and Waddell 2007), ecology (Robson 2014), and infectious disease (Powers et al. 2011).

Let \(M: \theta \to \phi\) be the deterministic model defined by the function relating a vector of input parameters \(\theta\) to an output vector \(\phi\), and suppose we have a prior on \(\theta\) denoted \(f_\theta(\theta)\) and a prior on \(\phi\) denoted \(f_\phi^{direct}(\phi)\).

However, note that we actually have two distinct priors on \(\phi\). There is the prior formed by the distribution induced on \(\phi\) by the prior for \(\theta\) and the function \(M\), where we denote this induced prior \(f_\phi^{induced}(\phi)\). Generally, these priors are based on different sources of information.

If \(M^{-1}\) exists, we can write this induced prior \(f_\phi^{induced}(\phi) = f_\theta(M^{-1}(\phi)) |J(\phi)|\). This result follows from the fact \(M(\theta) = \phi\), so we apply a change of variables to obtain the distribution of \(\phi\) from the distribution of \(M(\theta)\).

In practice, \(M^{-1}\) rarely exists exists since \(\theta\) is often of higher dimensionality then \(\phi\), in which cases \(M\) is not invertible. This means we generally approximate \(q^*_1(\phi)\) without acquiring its analytical form.

In addition to this induced prior, we have the prior \(f_\phi^{direct}(\phi)\), which does not involve \(M\) nor the inputs \(\theta\). Since these priors are based on different sources of information and may reflect different uncertainties, often it useful to use both sources of information to inform our estimates. To do so, we need to combine the distributions for \(q^*_1(\phi)\) and \(f_\phi^{direct}(\phi)\) to create a pooled distribution.

Multiple pooling strategies exist for distinct distributions, but one requirement for a Bayesian analysis is that the distribution should be independent of the order in which the prior is updated and the combining of the prior distribution. That is, updating the prior distributions using Bayes’ theorem and then combining distributions should yield the same result as combining distributions and then updating this combined distribution; pooling methods that have this property are deemed externally Bayesian. Logarithmic pooling has been shown to be externally Bayesian under some conditions, which are likely to hold in most settings. Furthermore, logarithmic pooling has actually been shown to be the only pooling method where this holds (Genest, McConway, and Schervish 1986). For this reason, Poole et al. recommend proceeding with logarithmic pooling for Bayesian melding.

The logarithmically pooled prior for \(\phi\) by pooling \(q^*_1(\phi)\) and \(f_\phi^{direct}(\phi)\) is proportional to

\[(f_\phi^{induced}(\phi))^{\alpha} (f_\phi^{direct}(\phi))^{1-\alpha}\]

where \(\alpha \in [0,1]\) is a pooling weight. Commonly, a choice of \(\alpha = 0.5\) is used to give the priors equal weight. In this case, logarithmic pooling may be referred to as geometric pooling since it is equivalent to taking a geometric mean.

Intuition for Definition of the Sampling Weights

If \(M\) is invertible, we can obtain the contrained distributions for the model inputs by simply inverting \(M\). However, \(M\) is rarely invertible, so we have to think about how to proceed in the noninvertible case.

Simple Noninvertible Discrete Example

To develop intuition the strategy Poole and Raftery (2000) recommend for handling the case where \(\phi\) is not invertible, we consider a mapping \(M: \theta \to \phi\) for \(\theta \in \mathbb{R}\) and \(\phi \in \mathbb{R}\) defined as follows (Figure 1). Note the choice of \(f_\theta,f_\phi^{direct}\) does not matter here as long as they are valid densities.

library(gridExtra)
library(grid)

# plot(mpg ~ hp, data = mtcars)


library(kableExtra)
df <- tibble::tibble(
             `$\\theta$` = c(1,2,3),
             `$f_\\theta(\\theta)$` = c(.3,.2,.5),
             `$M(\\theta)=\\phi$` = c(1,2,2),
             `$f_\\phi^{direct}(\\phi)$` = c(0.4,0.6,0.6))

df %>%
  ggplot(aes(x = `$\\theta$`, y = `$M(\\theta)=\\phi$`)) +
  geom_point(size = 2, color = "darkblue") +
  labs(x = TeX("$\\theta$"), y = TeX("$M(\\theta)$")) +
  theme_bw() +
  theme(axis.title = element_text(size = 18),
        axis.text = element_text(size = 7)) +
  scale_y_continuous(breaks = c(1,1.5,2), limits = c(.8,2.2)) 
A simple discrete example where $M$ is not invertible, since $M(2)=M(3)=2$.

Figure 1: A simple discrete example where \(M\) is not invertible, since \(M(2)=M(3)=2\).

kbl(df, escape=FALSE, 
    caption = "\\label{tab:example-dist}Specifing the Distributions of $f_\\theta(\\theta)$ and $f_\\phi^{direct}(\\phi)$") %>%
  kable_classic()%>%
  kable_styling(latex_options = "HOLD_position",
                full_width=FALSE) 
Table 1: Specifing the Distributions of \(f_\theta(\theta)\) and \(f_\phi^{direct}(\phi)\)
\(\theta\) \(f_\theta(\theta)\) \(M(\theta)=\phi\) \(f_\phi^{direct}(\phi)\)
1 0.3 1 0.4
2 0.2 2 0.6
3 0.5 2 0.6

We see that \(M\) is not invertible since \(\theta=1\) and \(\theta = 2\) both map to \(\phi=2\), which implies the inverse \(M^{-1}\) would not be well defined.

If we can determine \(f_\phi^{induced}\), we calculate \(M(\theta)\) and the corresponding probability mass values for each possible value \(M(\theta)\) can take on.

That is, we have \[\begin{align*} f_\phi^{induced}(1) &= f_{\theta}(1) = 0.3 & \text{ (since $\theta = 1$ maps to $\phi = 1$) } \\ f_\phi^{induced}(2) &= f_{\theta}(2) + f_{\theta}(3) = 0.2 + 0.5= 0.7 & \text{ (since $\theta = 2$ and $\theta=3$ both map to $\phi = 2$) .} \end{align*}\]

Then, we can compute the logarithmically pooled pooled prior. If we have weights \(\alpha=\{0.5,\;0.5\}\), then the normalizing constant \(t(\alpha)\) is given by

\[\begin{align*} t(\alpha) &= \left[ \sum_\Phi (f_\phi^{induced}(\phi))^{\alpha} (f_\phi^{direct}(\phi))^{1-\alpha} \right]^{-1} \\ &= \left[ \sum_{\phi \in \{1,2\}} (f_\phi^{induced}(\phi))^{0.5} (f_\phi^{direct}(\phi))^{0.5} \right]^{-1} \\ &= \Big[ (f_\phi^{induced}(1))^{0.5} (f_\phi^{direct}(1))^{0.5} + (f_\phi^{induced}(2))^{0.5} (f_\phi^{direct}(2))^{0.5} \Big]^{-1} . \tag{1}\\ \end{align*}\]

Using values of \(f^{induced}\) calculated above and values of \(f^{direct}\) from Table 1, we obtain

\[\begin{align*} f_\phi^{induced}(1))^{0.5} (f_\phi^{direct}(1))^{0.5}&= (0.3)^{0.5}(0.4)^{0.5} =0.3464\\ f_\phi^{induced}(2))^{0.5} (f_\phi^{direct}(2))^{0.5} &= (0.7)^{0.5} (0.6)^{0.5}=0.6481, \end{align*}\]

so we can plug these terms into (1) to yield

\[ t(\alpha) = [0.3464 +0.6481 ]^{-1}. \]

Denoting the pooled prior in \(\phi\)-space as \(f_\phi^{pooled}(\phi)\), we have

\[\begin{align*} f_\phi^{pooled}(1) &= t(a)(f_\phi^{induced}(1))^{0.5} (f_\phi^{direct}(1))^{0.5} \\ &= [0.3464 + 0.6481]^{-1} (0.3464) = 0.3483\\ f_\phi^{pooled}(2) &= t(a)(f_\phi^{induced}(2))^{0.5} (f_\phi^{direct}(2))^{0.5} \\ &= [0.3464 + 0.6481]^{-1} (0.6481) =0.6517. \end{align*}\]

We summarize these results and compare \(f_\phi^{induced}, f_\phi^{direct}\), and \(f_\phi^{pooled}\) in Figure 2, the specific values for each probability mass are in Table 2 .

df <- tibble::tibble(
             # `$\\theta$` = c(1,2,3),
             # `$f_\theta(\\theta)$` = c(.3,.2,.5),
             `$\\phi$` = c(1,2),
             `$f_\\phi^{direct}(\\phi)$` = c(0.4,0.6),
            `$f_\\phi^{induced}(\\phi)$` = c(0.3, 0.7),
            `$f_\\phi^{pooled}(\\phi)$` = c(0.3483,0.6517 ))
df %>% 
  select(`$\\phi$`,
         `$f_\\phi^{induced}(\\phi)$`,
         `$f_\\phi^{direct}(\\phi)$`,
         `$f_\\phi^{pooled}(\\phi)$`) %>%
  pivot_longer(cols=c(`$f_\\phi^{induced}(\\phi)$`,
                      `$f_\\phi^{direct}(\\phi)$`,
                      `$f_\\phi^{pooled}(\\phi)$`)) %>%
  mutate(name = factor(name, levels = c("$f_\\phi^{induced}(\\phi)$",
                                        "$f_\\phi^{pooled}(\\phi)$",
                                        "$f_\\phi^{direct}(\\phi)$"
                                        ))) %>%
  ggplot(aes(x = `$\\phi$`, ymin = 0, ymax = value, color = name)) +
  geom_linerange(position=position_dodge(width = .2)) +
  geom_point(aes(y=value), size =2,position=position_dodge(width = .2)) +
  theme_bw() +
  theme(axis.title.y = element_text(size = 9),
        axis.text = element_text(size = 7),
        axis.title.x = element_text(size = 11),
       # legend.position = "none",
        strip.text = element_text(size = 10),
       legend.text = element_text(size = 10)) +
  viridis::scale_color_viridis(discrete = TRUE, end = .8, begin = .2,
                               labels = c(`$f_\\phi^{induced}(\\phi)$` =TeX("$f_\\phi^{induced}(\\phi)$"),
                                   `$f_\\phi^{direct}(\\phi)$`= TeX("$f_\\phi^{direct}(\\phi)$"), 
                                  `$f_\\phi^{pooled}(\\phi)$`= TeX("$f_\\phi^{pooled}(\\phi)$"))) +
  scale_x_continuous(breaks = c(1,1.5,2)) +
  labs(x = TeX("$\\phi$"),
       y = "Probability Mass",
       color = "")
Comparing the direct, induced, and pooled distributions on $\phi$. Each is a discrete distribution where $\phi$ can take on only values 1 or 2.  Note that the probability mass of a given value of $\phi$ for the pooled distribution falls in between the probability masses of the induced and direct distributions for that value of $\phi$.

Figure 2: Comparing the direct, induced, and pooled distributions on \(\phi\). Each is a discrete distribution where \(\phi\) can take on only values 1 or 2. Note that the probability mass of a given value of \(\phi\) for the pooled distribution falls in between the probability masses of the induced and direct distributions for that value of \(\phi\).

kbl(df, escape=FALSE,
    caption="Comparing the probability mass values of the direct, induced, and pooled distributions on $\\phi$.") %>%
  kable_classic()%>%
  kable_styling(latex_options = "HOLD_position")
Table 2: Comparing the probability mass values of the direct, induced, and pooled distributions on \(\phi\).
\(\phi\) \(f_\phi^{direct}(\phi)\) \(f_\phi^{induced}(\phi)\) \(f_\phi^{pooled}(\phi)\)
1 0.4 0.3 0.3483
2 0.6 0.7 0.6517

However, we also want the pooled prior on the inputs \(\theta\), that is, \(f_\theta^{pooled}(\theta)\). Poole and Raftery (2000) reasoned as follows. Since \(M\) uniquely maps \(\theta=1\) to \(\phi =1\), the probability that \(\theta=1\) should be equal to the probability \(\phi = 1\). That is, we should have \(f_\theta^{pooled}(1) = f_\phi^{pooled}(1)\).

However, the relationship for \(\theta=2\) or \(\theta=3\) to \(\phi\) is not one to one. Since \(M(2)=2\) and \(M(3)=2\), the sum of the probabilities for \(\theta=1\) and \(\theta=2\) should be equal to that for \(\phi=2\), that is, \(f_\theta^{pooled}(2) + f_\theta^{pooled}(3) = f_\phi^{pooled}(2) = 0.6517\).

The challenge here is how we divide the probability for \(f_\phi^{pooled}(2)\), which is defined, among \(f_\theta^{pooled}(2)\) and \(f_\theta^{pooled}(3)\). The prior for \(\phi\) yields no information to assist in this choice, because knowing which value \(\phi\) takes on does not give us any information about whether \(\theta=2\) or \(\theta=3\). Thus, the information we have about \(\theta\) must be taken from \(f_\theta(\theta)\).

That is, we can assign a probability for \(f_\theta^{pooled}(2)\) by considering the probability that \(\theta = 2\) relative to the probability \(\theta =3\), computing

\[f_\theta^{pooled}(2) = f_\phi^{pooled}(2) \Big( \frac{f_\theta(2)}{f_\theta(2) + f_\theta(3)}\Big).\]

If the probability \(\theta\) takes on the value \(2\) is lower in this case than the probability \(\theta=3\) which we know from the prior on \(\theta\), \(f_\theta(\theta)\), then the pooled prior on \(\theta\), denoted \(f_\theta^{pooled}(2)\), should reflect this.

Using this reasoning, we have \[\begin{align*} f_\theta^{pooled}(2) &= (0.7) \frac{0.2}{0.2+0.5} = 0.1862\\ f_\theta^{pooled}(3) &= (0.7) \frac{0.5}{0.2+0.5} = 0.4655. \end{align*}\]

The result in this simple example, using \(f_\theta(\theta)\) to determine how to distribute the probability for values of \(\phi\) where multiple \(\theta\) map to \(\phi\), can be used to derive general formulas to compute \(f_\theta^{pooled}(\theta)\) for discrete and continuous distributions (Poole and Raftery 2000).

General Solution for the Discrete Case

Denote the possible values of \(\theta\) as \(A_1, A_2, \dots\), the possible values of \(\phi\) as \(B_1, B_2, \dots\), and a mapping \(m: \mathbb{N} \to \mathbb{N}\) such that \(M(A_i) = B_{m(i)}\) and \(C_j = M^{-1}(B_j) = \{A_i : M(A_i) = B_j\}\). Then

\[f_\theta^{pooled}(A_i) = f_\phi^{pooled}(B_{m(i)}) \left( \frac{f_\theta(A_i)}{f_\phi^{induced}(B_{m(i)})} \right).\]

General Solution for the Continuous Case

We denote \(B = M(A) = \{M(\theta) : \theta \in A \}\) and \(C = M^{-1}(B) = \{\theta: M(\theta) \in B \}\).

Then

\[ f_\phi^{pooled} (M(\theta)) =t({\alpha}) f_\theta(\theta) \left( \frac{f_\phi^{direct}(M(\theta))}{f_\phi^{induced}(M(\theta))} \right)^{1-\alpha} \tag{2} \] where \(t({\alpha})\) is a renormalizing constant for the choice of \(\alpha\).

Implementation through the Sampling-Importance-Resampling Algorithm

We can obtain the pooled distributions \(f^{pooled}_\theta\) and \(f^{pooled}_\phi\) by using the Sampling-Importance-Resampling Algorithm.

The steps are as follows.

  1. We draw \(m\) observations of \(\theta\) from its prior distribution \(f_\theta(\theta)\).
  2. For every \(\theta_i\) we compute \(\phi_i = M(\theta_i)\) to obtain a sample from the induced distribution.
  3. Since the density \(f_\phi^{induced}(\phi)\) is unlikely to have an analytical form, we can compute it via a density approximation such as kernel density estimation.
  4. Construct weights proportional to the ratio of the prior on \(\phi\) evaluated at \(M(\theta_i)\) to the induced prior \(f_\phi^{induced}\) evaluated at \(M(\theta_i)\). If a likelihood \(L_1(\theta)\) for the inputs and \(L_2(\phi)\) is available, the weights are \[w_i = \left( \frac{f_\phi^{direct}(M(\theta_i))}{f_\phi^{induced}(M(\theta_i))} \right)^{1-\alpha}L_1(\theta_i) \; L_2(M(\theta_i)).\] However, in this work, no likelihood is available for the variables of interest, so the likelihood is left out of the weights, leaving us with \[w_i = \left( \frac{f_\phi^{direct}(M(\theta_i))}{f_\phi^{induced}(M(\theta_i))} \right)^{1-\alpha}.\]
  5. Sample \(r\) observations from \(\theta\) and \(\phi\) from step (1) with probabilities proportional to the weights from (4).

Bayesian Melding Applied to COVID-19 Misclassification

In this work, we can relate the inputs \(\theta = \{P(S_1|untested), \alpha, \beta \}\) and \(\phi = P(S_0|test +,untested)\) by the deterministic model \(M: \theta \to \phi\) given by \[P(S_0|test+, untested) = \frac{\beta(1 - P(S_1|untested))}{\beta(1-P(S_1|untested)) + \alpha P(S_1|untested)}.\] The derivation of \(M\) is in the following section.

Now, we have two distributions on \(\phi\): the distribution based on data on the asymptomatic rate of infection of COVID-19, and the distribution formed by taking \(M(\theta)\) where \(\theta\) represents the values from the defined distributions of \(\alpha,\beta,\) and \(P(S_1|untested)\). With Bayesian melding, we pool these distributions using logarithmic pooling, and then implement the sampling-importance-resampling algorithm to obtain constrained distributions of the inputs \(\theta\) that are in accordance with information about the asymptomatic rate of the virus.

Due to the uncertainty around our definitions of \(\alpha\) and \(\beta\), it is particularly useful to leverage the information we have about the asymptomatic rate of the virus \(P(S_0|test +,untested)\) because a large collection of studies has been published in this area. In a meta-analysis pooling data from 95 studies, the pooled estimate among the confirmed population that was asymptomatic was 40.50% [95% CI, 33.50%-47.50%] (Ma et al. 2021). Another meta-analysis including 350 studies estimated the asymptomatic percentage to be 36.9% [95% CI: 31.8 to 42.4%], and, when restricting to screening studies, 47.3% (95% CI: 34.0% -61.0%) (Sah et al. 2021).

This means we have two priors on the asymptomatic rate \(\phi\), that by taking \(M(\theta)\) for sampled values of \(\theta\), denoted \(f_\phi^{induced}(M(\theta))\) in the previous section, and that based on data about the asymptomatic rate, \(f_\phi^{direct}(\phi)\).

Distribution of \(\theta = \{\alpha, \beta, P(S_1|untested) \}\)

First, sampling the values of \(\theta\), we have:

library(latex2exp)



############################################################    
# BETA SHAPE PARAMETERS WITH DESIRED MEAN AND SD
############################################################    

get_beta_params <- function(mu, sd) {
  var = sd^2
  alpha <- ((1 - mu) / var - 1 / mu) * mu ^ 2
  beta <- alpha * (1 / mu - 1)
  return(params = list(alpha = alpha, beta = beta))
}
library(latex2exp)



###############################################################
# BETA PARAMETERS FROM DESIRED MEAN AND VARIANCE
###############################################################
get_beta_params <- function(mu, sd) {
    var = sd^2
    alpha <- ((1 - mu) / var - 1 / mu) * mu ^ 2
    beta <- alpha * (1 / mu - 1)
    return(params = list(alpha = alpha,
                         beta = beta))
}




###############################################################
# BETA DENSITY WITH DESIRED MEAN AND VARIANCE
###############################################################
beta_density <- function(x, mean, sd, bounds=NA) {
    shape_params <-  get_beta_params(
        mu = mean,
        sd = sd)

    if(!length(bounds) == 1){
        # message("here")
        dtrunc(x,
               spec = "beta",
               a = bounds[1],
               b = bounds[2],
              shape1 = shape_params$alpha,
              shape2 = shape_params$beta) %>%
            return()
    }else{
        dbeta(x,
          shape1 = shape_params$alpha,
          shape2 = shape_params$beta)  %>%
            return()
        }
}




###############################################################
# SAMPLE FROM BETA DENSITY WITH DESIRED MEAN AND VARIANCE
###############################################################

sample_beta_density <- function(n, mean, sd, bounds = NA) {

    shape_params <-  get_beta_params(
        mu = mean,
        sd = sd)

    rbeta(n,
          shape1 = shape_params$alpha,
          shape2 = shape_params$beta)

    if(!length(bounds) == 1){
        # message("here")
        rtrunc(n,
               spec = "beta",
               a = bounds[1],
               b = bounds[2],
               shape1 = shape_params$alpha,
               shape2 = shape_params$beta) %>%
            return()
    }else{
        rbeta(n,
              shape1 = shape_params$alpha,
              shape2 = shape_params$beta)  %>%
            return()
    }
}




###############################################################
# GAMMA PARAMETERS FROM DESIRED MEAN AND VARIANCE
###############################################################
get_gamma_params <- function(mu, sd) {
    var = (mu/sd)^2
    shape = (mu/sd)^2
    scale = sd^2/mu
    return(params = list(shape = shape,
                         scale = scale))
}


###############################################################
# GAMMA DENSITY WITH DESIRED MEAN AND VARIANCE
###############################################################
gamma_density <- function(x, mean, sd, bounds=NA) {

    shape_params <-  get_gamma_params(
        mu = mean,
        sd = sd)

    if(!length(bounds) == 1){
        #message("here")
        dtrunc(x,
               spec = "gamma",
               a = bounds[1],
               b = bounds[2],
               shape = shape_params$shape,
               scale = shape_params$scale) %>%
            return()
    }else{
        dgamma(x,
               shape = shape_params$shape,
               scale = shape_params$scale) %>%
            return()
    }
}


sample_gamma_density <- function(n, mean, sd, bounds = NA) {

    shape_params <-  get_gamma_params(
        mu = mean,
        sd = sd)

    if(!length(bounds) == 1){
        #message("here")
        rtrunc(n,
               spec = "gamma",
               a = bounds[1],
               b = bounds[2],
               shape = shape_params$shape,
               scale = shape_params$scale) %>%
            return()
    }else{
        rgamma(n,
               shape = shape_params$shape,
               scale = shape_params$scale) %>%
            return()
    }
}





###############################################################
# INDUCED PRIOR ON ASYMPTOMATIC RATE  P(S_0|test+,untested)
###############################################################

# input sampled values of theta and compute M(\theta)
est_P_A_testpos = function(P_S_untested, alpha, beta){
    (beta * (1 - P_S_untested)) / (( beta * (1 - P_S_untested)) + (alpha * P_S_untested))
}
# set prior parameters
alpha_mean = .95
alpha_sd = 0.08
alpha_bounds = NA
 # alpha_bounds = c(.8,1),
beta_mean = .15
beta_sd =.09
beta_bounds = NA
#  beta_bounds = c(0.002, 0.4),
s_untested_mean = .03
s_untested_sd = .0225
#  s_untested_bounds = c(0.0018, Inf),
s_untested_bounds = NA
p_s0_pos_mean = .4
p_s0_pos_sd = .1225
p_s0_pos_bounds = NA
#  p_s0_pos_bounds = c(.25, .7),
pre_nsamp = 1e5
post_nsamp = 1e4

theta <- tibble(alpha = sample_gamma_density(pre_nsamp,
                                                mean = alpha_mean,
                                                sd = alpha_sd,
                                                bounds = alpha_bounds),
                    beta= sample_beta_density(pre_nsamp,
                                              mean = beta_mean,
                                              sd = beta_sd,
                                              bounds = beta_bounds),
                    P_S_untested = sample_beta_density(pre_nsamp,
                                                       mean = s_untested_mean,
                                                       sd = s_untested_sd,
                                                       bounds = s_untested_bounds)) %>%
        mutate(phi_induced = est_P_A_testpos(P_S_untested = P_S_untested,
                                             alpha = alpha,
                                             beta=beta))



theta %>% select(-phi_induced) %>%
  rename( "$\\alpha$" = alpha,
          "$\\beta$" = beta,
         "$P(S_1|untested)$" = P_S_untested ) %>%
  pivot_longer(cols= everything()) %>%
  ggplot(aes(x=value)) +
  geom_density(alpha = .8, fill = "black") +
  theme_bw() +
   theme(plot.title = element_text(size = 18, 
                                   face="bold",
                                   hjust = .5),
          strip.text = element_text(size = 18, color="white"),
         strip.background=element_rect(fill = "#363636"),
         axis.title = element_text(size = 18)
          )  +
   labs(x = "Value",
        y = "Density",
        title = TeX("Sampling from $\\theta = \\alpha,\\beta, P(S_1|untested)$")
       ) +
  facet_wrap(~name,
             labeller=  as_labeller(TeX,
                                    default = label_parsed)
             )

# ggsave("./img/theta.png", dpi = 800)

Prior and Induced Prior Distributions for \(P(S_0|test+,untested)\)

Then, taking \(M(\theta)\), we can compute the induced distribution \(f_\phi^{induced}(M(\theta))\) and compare it to our prior on \(\phi\) from meta-analyses on the asymptomatic rate (\(f_\phi^{direct}(\phi)\)).

########################################################################################
# COMPARING DIRECT AND INDUCED PRIORS ON ASYMPTOMATIC RATE  P(S_0|test+,untested)
########################################################################################


theta %>%
  mutate(phi_prior = sample_beta_density(pre_nsamp, 
                                         mean = p_s0_pos_mean, 
                                         sd =p_s0_pos_sd,
                                         bounds = p_s0_pos_bounds)) %>%
  pivot_longer(c(phi_prior, 
                 phi_induced), names_to = "density") %>%
  mutate(density = ifelse(density == "phi_induced", 
                          "Induced Prior Distribution", 
                          "Original Prior Distribution")) %>%
  ggplot(aes(x=value, fill = density)) +
  geom_density(alpha = .8) +
  theme_bw() + 
  theme(legend.position = "right",
          legend.text = element_text(size = 14),
          plot.title = element_text(size = 18),
          axis.text.y = element_blank()) +
  scale_fill_manual(values = c("#577C9C", "#56BFA8")) +
  labs(title = TeX("$\\overset{Distribution \\, on \\,P(S_0|untested, +)}{and \\, Induced \\, Distribution \\, on \\, P(S_0|untested, +)}$"),
       fill = "",
       y = "Density",
       x = "Value")

Resampling

Then, we can apply Bayesian melding to generate constrained distributions for the inputs \(\theta\) and output \(\phi\).

We already have sampled values of \(\theta\) from the prior \(f_\theta(\theta)\) (i.e., the defined distributions for \(\alpha\), \(\beta\), and \(P(S_1|untested)\)) in the following data frame. The column phi_induced is \(M(\theta)\), representing sampled values from the induced distribution \(f_\phi^{induced}(\phi)\).

theta

We perform a kernel density estimation to approximate the density of \(f_\phi^{induced}(\phi)\).

The density function returns a list where \(x\) is the coordinates of the points where the density is estimated, and \(y\) is the estimated density values.

# theta contains values sampled from alpha, beta, P_S_untested, and M(theta) = phi_induced
# induced phi
phi <- theta$phi_induced

# approximate $f_\phi^{induced}(\phi)$ via a density approximation
phi_induced_density <- density(x = phi, n = pre_nsamp, adjust = 2, kernel = "gaussian")

glimpse(phi_induced_density)
## List of 7
##  $ x        : num [1:100000] -0.0668 -0.0668 -0.0668 -0.0668 -0.0667 ...
##  $ y        : num [1:100000] 3.35e-06 3.36e-06 3.36e-06 3.36e-06 3.37e-06 ...
##  $ bw       : num 0.0254
##  $ n        : int 100000
##  $ call     : language density.default(x = phi, adjust = 2, kernel = "gaussian", n = pre_nsamp)
##  $ data.name: chr "phi"
##  $ has.na   : logi FALSE
##  - attr(*, "class")= chr "density"

We see that the kernel density estimation estimates the density at coordinates outside the range of the sampled values of \(\phi\).

tibble(`Coordinates Where Density is Estimated` = phi_induced_density$x, 
       `Sampled values of Phi Induced` = theta$phi_induced) %>%
  pivot_longer(cols=everything()) %>%
  ggplot(aes(x=value, fill = name)) +
  geom_density(alpha = .6) +
  theme_bw() +
  theme(plot.title = element_text(hjust = .5, size = 16),
        legend.text = element_text(size = 14)) +
  labs(fill = "",
       title = TeX(paste0("$\\overset{Comparing\\; Coordinates\\; Where",
       "\\;Density \\;is\\; Estimated}{ to \\; Sampled\\; Values\\; of\\; \\phi}$")))

This section iterates over every value of the samples from the induced distribution on \(\phi\) and finds the coordinate \(x\) closest to that value of \(\phi\) and stores the density value at this coordinate. This means the vector phi_sampled_density represents the densities at the coordinates closest to the sampled values of \(\phi\) from the induced prior. We verify this by looking at the extracted coordinates and the sampled values of \(\phi\).

library(tictoc)
library(future)


##############
# ORIGINAL
##############
tic()
 phi_sampled_density <- unlist(parallel::mclapply(
    X = phi,
    FUN = function(p){
    phi_induced_density$y[which(phi_induced_density$x > p)[1]]
  }))
toc()
## 51.341 sec elapsed
############################
# MAP IMPLEMENTATION
############################
library(future)
library(tictoc)
future::plan(multisession, workers = 3)
tic()
indexes <- furrr::future_map(phi, ~{ 
  which(phi_induced_density$x > .x)[1]
  }) %>%
  unlist()
toc()
## 47.199 sec elapsed
# check that map method is equivalent
all.equal(phi_sampled_density, phi_induced_density$y[indexes])
## [1] TRUE
tibble(closest_coords =phi_induced_density$x[indexes], 
       sampled_induced = theta$phi_induced) %>%
  pivot_longer(cols=everything()) %>%
  ggplot(aes(x=value, fill = name)) +
  geom_density(alpha = .5) +
  theme_bw() +
  theme(plot.title = element_text(hjust = .5, size = 16),
        legend.text = element_text(size = 14)) +
  labs(fill = "",
       title =  TeX(paste0("$\\overset{Comparing\\; Coordinates\\; Where",
       "\\;Density \\;is\\; Estimated}{ to \\; Sampled\\; Values\\; of\\; \\phi}$"))) +
  scale_fill_manual(labels = c(
    closest_coords = TeX("Closest Coordinates to Sampled Values of $\\phi$"),
    sampled_induced =  TeX("Sampled values of Induced Distribution for $\\phi$")),
                    values = c("blue", "yellow"))

# not equal, but very close
all.equal(phi_induced_density$x[indexes],theta$phi_induced)
## [1] "Mean relative difference: 7.093745e-06"
phi_sampled_density <- phi_induced_density$y[indexes]

Now, as described in the section regarding the weights of the Sampling-Importance-Resampling algorithm, the weights are \(w_i = \left( \frac{f_\phi^{direct}(M(\theta_i))}{f_\phi^{induced}(M(\theta_i))} \right)^{1-\alpha}.\)

However, we compute \(weights =\left( \frac{f_\phi^{induced}(M(\theta_i))}{f_\phi^{direct}(M(\theta_i))} \right)^{1-\alpha}\) and then account for this with the argument prob=1/weights when sampling with these weights.

Here, phi_sampled_density represents \(f_\phi^{induced}(\phi)\). It is the density at (approximately) the sampled values of \(\phi\) where the density \(f_\phi^{direct}\) is being evaluated.

This means applying the function (phi_sampled_density_i/ dp_s0_pos(phi_i))^(1-alpha) to each component of the phi vector is computing the density (approximately) at each value of phi from the induced density \(f_\phi^{induced}(\phi)\) and dividing this by the prior density \(f_\phi^{direct}(\phi)\) evaluated at this value of phi.

dp_s0_pos <- function(x){
  truncdist::dtrunc(x = x,spec = "beta",a = 0.25,b = 0.7,
                    shape1 = get_beta_params(mu = 0.4, sd = (0.35)^2)$alpha,
                    shape2 = get_beta_params(mu = 0.4, sd = (0.35)^2)$beta)
}
rp_s0_pos <- function(x){
  truncdist::rtrunc(x,spec = "beta",a = 0.25,b = 0.7,
                    shape1 = get_beta_params(mu = 0.4, sd = (0.35)^2)$alpha,
                    shape2 = get_beta_params(mu = 0.4, sd = (0.35)^2)$beta)
}



# dp_s0_pos = P(A|test+, untested)
################
# ORIGINAL
################
weights <- parallel::mcmapply(
    FUN = function(p,
                   phi_sampled_density,
                   alpha){
      (phi_sampled_density/ dp_s0_pos(p))^(1-alpha)
    },
    p=phi, 
    phi_sampled_density=phi_sampled_density,
    MoreArgs = list(alpha=0.5))

################
# REWRITTEN
################
weights2 <- map2_dbl(
  phi_sampled_density,
  phi,
  function(phi_sampled_density_i, phi_i) {
    # pooling weight
    alpha = .5
    (phi_sampled_density_i/ dp_s0_pos(phi_i))^(1-alpha) 
  }
)


weights3 <- (phi_sampled_density/ dp_s0_pos(phi))^(.5)

all.equal(weights2,weights3)
## [1] TRUE
# check equivalence
all.equal(weights,weights2)
## [1] TRUE

Once we have these weights, we resample the posterior to obtain a sample from the target distribution \(k_\alpha \Big( f^{induced}(M(\theta)) \Big)^{0.5} \Big( f^{direct} (M(\theta)) \Big)^{0.5}\), where \(k_\alpha\) is the normalizing constant needed to make the pooled density valid.

# resample the posterior
post_samp_ind <-sample.int(n=pre_nsamp,
                           size=post_nsamp, 
                           prob=1/weights,
                           replace=T)


pi_samp <- cbind(theta[post_samp_ind,], 
                 P_A_testpos =  phi[post_samp_ind]) %>%
  select(-phi_induced)

pi_samp_long <- pi_samp %>%
  pivot_longer(cols=everything()) %>%
  mutate(type = "After Melding")

compare_melded <- theta %>%
  mutate(P_A_testpos = sample_beta_density(pre_nsamp, 
                                         mean = p_s0_pos_mean, 
                                         sd =p_s0_pos_sd,
                                         bounds = p_s0_pos_bounds)) %>%
  pivot_longer(cols=everything()) %>%
  mutate(type = ifelse(
    name == "phi_induced",
  "Induced", "Before Melding")) %>%
  mutate(name = ifelse(name == "phi_induced", "P_A_testpos", name)) %>%
  bind_rows(pi_samp_long) %>%
  mutate(name = case_when(
    name == "alpha" ~"$\\alpha$",
    name == "beta" ~"$\\beta$",
    name == "P_A_testpos" ~ "$P(S_0|test+,untested)$",
    name == "P_S_untested" ~ "$P(S_1|untested)$")
  ) %>%
  mutate(name = factor(name,
                       levels = c(
                         "$\\alpha$",
                         "$\\beta$",
                         "$P(S_1|untested)$",
                         "$P(S_0|test+,untested)$"))) 

library(cowplot)


p <- compare_melded %>%
  ggplot(aes(x = value, fill = type)) +
  geom_density(alpha = .5) +
  facet_wrap(~name,
             labeller = as_labeller(
               TeX,
               default = label_parsed), ncol = 3) + 
  theme_c(axis.text.y = element_blank(),
          axis.ticks.y = element_blank(),
          axis.title = element_text(size = 18),
          axis.text.x = element_text(size = 12),
          plot.title =element_text(size = 25, margin =margin(.5,.5,.5,.5)),
          strip.text = element_text(size = 16, color="white"),
          legend.text = element_text(size = 16)) +
  labs(
       fill = "",
       y = "Density") +
  scale_fill_manual(values = c("#5670BF", "#418F6A", "#B28542"))



p1 <- p %+% subset(compare_melded, 
                   name %in% c( "$\\alpha$",
                         "$\\beta$",
                         "$P(S_1|untested)$")) %+%
  theme(legend.position = "none")

p2 <-   p %+% subset(compare_melded, 
                     name == "$P(S_0|test+,untested)$")

legend <- get_legend(
  p1
)

cowplot::plot_grid(
  p1,
  p2,
  nrow = 2
)

# ggsave("./img/melded.png", dpi = 700)

More on the Intuition for the Weights

Formal Definition of Logarithmic Pooling

As outlined in Carvalho et al. (2023), we can formally define logarithmic pooling as follows.

If we have a set of densities \(\{ f_1(\phi), f_2(\phi), \ldots, f_n(\phi)\}\) and corresponding pooling weights \(\boldsymbol{\alpha}=\{\alpha_1, \alpha_2, \ldots, \alpha_n\}\), then the pooled density is

\[ t(\boldsymbol{\alpha}) \prod_{i=0}^n f_i(\phi)^{\alpha_i}\] where \(t(\boldsymbol{\alpha})\) is the normalizing constant \(t(\boldsymbol{\alpha}) = \dfrac{1}{ \int_{\Phi}\prod_{i=0}^n f_i(\phi)^{\alpha_i} d\phi}\) to ensure the pooled density is a valid probability density.

The case for this work is more simple: we only have two densities we wish to pool, \(f_\phi^{induced}\) and \(f_\phi^{direct}\), and we assign them equal weights by letting \(\boldsymbol{\alpha} = \{.5, .5\}\). This yields

\[f^{pooled}(\phi) = \left( f^{induced} (\phi) \right)^{0.5} \left( f^{direct} (\phi) \right)^{0.5}.\]

Theorem 1: Distribution After Resampling with Sampling Weights

Suppose we have a random variable \(X\) with PDF \(f_X\) and are sampling with weights \(h(x)\) for a nonnegative function \(h\). We are interested in determining the distribution after sampling.1

Letting \(U\) denote the event that \(X\) was sampled, the probability that a given value of \(X\) is sampled is \(P(U=1|X=x) = h(x)\). We want the post-sampling distribution, that is, \(P(X \leq z | U = 1)\).

By the definition of conditional probability we have \(P(X \leq z | U = 1) = \dfrac{P(X \leq z , U = 1)}{ P(U=1)}\),

and since we will always be sampling at least some values, \(P(U=1)\) will always be a nonzero constant.

This gives us

\[P(X \leq z | U = 1) = \dfrac{1}{ P(U=1)} P(X \leq z , U = 1) \tag{1}.\] Since the joint probability is \[P(X \leq z, U= 1) = P(U=1|X\leq z) \; P(X\leq z),\] we write it as the integral \[P(X \leq z, U = 1) = \int^z P(U=1|X=x) \; f_X(x) \; dx.\]

Substituting this expression of the joint probability into (1), we have

\[P(X \leq z | U = 1) = \dfrac{1}{ P(U=1)} \int^z P(U=1|X=x) \; f_X(x) \; dx.\]

As defined, \(h(x) = P(U=1|X=x)\), so we have

\[P(X \leq z | U = 1) = \dfrac{1}{ P(U=1)} \int^z h(x) \; f_X(x) \; dx.\] \[ \propto \int^z h(x) \; f_X(x) \; dx.\]

Demonstrative Examples

The most trivial case is when \(h(x) =1\), and as such, we sample all \(X\) with equal weight, obtaining the original distribution.

However, there are less trivial examples where \(P(X \leq z | U = 1)\) is in the form of a recognizable distribution.

Example 1:

For example, suppose \(X \sim Exp(\lambda)\), so \(f_X(x) = \lambda e^{-\lambda x}\), and we sample with weights direction proportional to \(X\), that is, \(h(x) = x\).

Then \[P(X \leq z | U = 1) \propto \int^z h(x) \; f_X(x) \; dx = \int^z x \cdot \lambda e^{-\lambda x} dx\] and from the PDF of the gamma distribution, \(f(x) = \dfrac{\beta^\alpha}{\Gamma(\alpha) }x^{\alpha - 1} e^{-\beta x}\) we can recognized that \(x \cdot e^{-\lambda x}\) corresponds to the gamma distribution with \(\alpha = 2\) and \(\beta = \lambda\).

We can see this result by considering \(X\) before and after sampling below.

library(latex2exp)
library(viridis)
nsamp <- 1e6




##########################################
# EXAMPLE 1
##########################################

# weights are proportional to random variable itself
input_rate = .2
phi_sim <- rexp(nsamp, rate = input_rate)
post_samp_ind <- sample.int(nsamp, replace=TRUE, prob=phi_sim)


dat <- tibble(`Before Sampling` = phi_sim,
       `After Sampling` = phi_sim[post_samp_ind]) %>%
  pivot_longer(cols = everything()) %>%
  mutate(name = factor(name, 
                       levels = c("Before Sampling", 
                                  "After Sampling")))



##########################################
# PLOT *NOT* INCLUDING THEORETICAL DENSITY
##########################################
dat %>% 
  # filter(name == "post_melding") %>%
  ggplot()  +
  geom_density(aes(x=value, fill = name),
               alpha = .6,
               color = NA) +
  labs(title = TeX(
    paste0("Sampling from $X \\sim Exp(\\lambda =  ",
           input_rate, ")$ with Weights $h(x) =x$")),
  # subtitle =TeX(paste0("PDF of Gamma$(2, \\lambda)$ in red")),
  fill = "",
  y= "Density") +
  theme_bw() +
  theme(plot.title = element_text(hjust = .5, size = 18),
        plot.subtitle = element_text(hjust = .5, size = 16),
        legend.text = element_text(size = 18)) +
  scale_fill_manual(values = c("#0C2B67", "#DE8600")) 

Then, we can see that the PDF of the the gamma distribution with \(\alpha = 2\) and \(\beta = \lambda\) corresponds to the post-sampling distribution as expected.

library(ggtext)

##########################################
# PLOT INCLUDING THEORETICAL DENSITY
##########################################
color_lab <-  paste0("PDF of Gamma(2, ", input_rate, ")")

dat %>% 
  ggplot()  +
  geom_density(aes(x=value, fill = name),
               alpha = .6,
               color = NA) +
  labs(title = TeX(
    paste0("Sampling from $X \\sim Exp(\\lambda =  ",
           input_rate, ")$ with Weights $h(x) =x$")),
  subtitle =(paste0("PDF of Gamma(2, ",
                       input_rate,
                       ") in <span style = 'color:darkred'>Red</span>")),
  fill = "",
  y = "Density") +
  theme_bw() +
  theme(plot.title = element_text(hjust = .5, size = 18),
        plot.subtitle = element_markdown(hjust = .5, size = 16),
        legend.text = element_text(size = 18)) +
  stat_function(fun = dgamma, 
                args=list(shape=2,
                          scale=1/input_rate), 
                aes(color = color_lab),
                size = 1.2) +
  scale_color_manual(name = "", 
                     values = c("darkred")) +
  scale_fill_manual(values = c("#0C2B67", "#DE8600")) +
  guides(color = guide_legend(
    override.aes = list(size = 4)))

Example 2:

Similarly, again suppose \(X \sim Exp(\lambda)\), so \(f_X(x) = \lambda e^{-\lambda x}\). However, now we sample with weights defined by \(h(x)= e^{-\lambda x}\). Then \[P(X \leq z | U = 1) \propto \int^z h(x) \; f_X(x) \; dx = \int^z e^{-\lambda x} \cdot \lambda e^{-\lambda x} dx\] \[= \int^z \lambda e^{-2 \lambda x} dx\] which is proportional to the exponential distribution with parameter \(2\lambda\).

Plotting \(X\) before and after sampling, we have:

################################################################
# EXAMPLE 2
################################################################

# weights are proportional to exp(-rate * random_variable)
input_rate = .2
phi_sim <- rexp(nsamp, rate = input_rate)
post_samp_ind <- sample.int(nsamp, replace=TRUE, prob=exp(phi_sim*-1*input_rate))


dat <- tibble(`Before Sampling` = phi_sim,
       `After Sampling` = phi_sim[post_samp_ind]) %>%
  pivot_longer(cols = everything())  %>%
  mutate(name = factor(name, 
                       levels = c("Before Sampling", 
                                  "After Sampling")))

library(latex2exp)

##########################################
# PLOT *NOT* INCLUDING THEORETICAL DENSITY
##########################################
dat %>% 
  ggplot()  +
  geom_density(aes(x=value, fill = name),
               alpha = .6,
               color = NA) +
  # stat_function(fun = dexp, 
  #               args=list(rate = 2*input_rate), 
  #               color = "red") +
  labs(title = TeX(
    paste0("Sampling from $X \\sim Exp(\\lambda=", input_rate, ") $ with Weights ",
  "$e^{-\\lambda \\; x}$")),
  #subtitle =TeX("PDF of $Exp(2*\\lambda)$ in Red"),
  fill = "",
  y = "Density") +
  theme_bw() + 
  theme(plot.title = element_text(hjust = .5, size = 18),
        plot.subtitle = element_text(hjust = .5, size = 16),
        legend.text = element_text(size = 18)) +
  scale_fill_manual(values = c("#0C2B67", "#DE8600"))

and then plotting the PDF of the exponential distribution with parameter \(2\lambda\) we can see the correspondence to the post-sampling distribution.

color_lab <-  paste0("PDF of Exp(2 * ", input_rate, ")")

##########################################
# PLOT INCLUDING THEORETICAL DENSITY
##########################################
dat %>% 
  ggplot()  +
  geom_density(aes(x=value, fill = name),
               alpha = .6,
               color = NA) +
  stat_function(fun = dexp,
                args=list(rate = 2*input_rate),
           #     color = "darkred",
                size=1.2,
                aes(color = color_lab)) +
  labs(title = TeX(
    paste0("Sampling from $X \\sim Exp(\\lambda=", 
           input_rate, 
           ") $ with Weights ",
  "$e^{-\\lambda \\; x}$")),
  subtitle =TeX("PDF of $Exp(2*\\lambda)$ in Red"),
  fill = "",
  y = "Density") +
  theme_bw() + 
  theme(plot.title = element_text(hjust = .5, size = 18),
        plot.subtitle = element_text(hjust = .5, size = 16),
        legend.text = element_text(size = 18)) +
  scale_fill_manual(values = c("#0C2B67", "#DE8600"))  +
  scale_color_manual(name = "", 
                     values = c("darkred")) +
  guides(color = guide_legend(
    override.aes = list(size = 4)))

Justification for Melding Implementation

In the implementation of melding in this work, we have \(\phi = \{ \phi_1, \ldots, \phi_n \}\) is a vector containing a sample where \(\phi_i \sim f_{\phi}^{induced}\).

Considering the \(i^{th}\) element \(\phi_i\) , the density is \(f_{\phi}^{induced}(\phi_i)\).

With sample.int, we sample the index \(i\) with weight \(\left(\dfrac{f_\phi^{direct}(\phi_i)}{f_{\phi}^{induced}(\phi_i)}\right)^{0.5}\). That is, \(h(\phi_i) = \left(\dfrac{f_\phi^{direct}(\phi_i)}{f_{\phi}^{induced}(\phi_i)}\right)^{0.5}\).

We want to define the distribution of the \(\phi\) after resampling with these weights, denoted \(f_\phi^{post}\).

By Theorem 1, the density at \(f_\phi^{post}(\phi_i)\) then is \[f_\phi^{post} (\phi_i) = \left( h(\phi_i) \right) \; \left( f_{\phi}^{induced}(\phi_i)\right)\] and plugging in our expression of \(h\) we have

\[f_\phi^{post} (\phi_i) = \left(\dfrac{f_\phi^{direct}(\phi_i)}{f_{\phi}^{induced}(\phi_i)}\right)^{0.5} \; \left( f_{\phi}^{induced}(\phi_i)\right).\] Simplifying, we have

\[f_\phi^{post} (\phi_i) = \left({f_\phi^{direct}(\phi_i)}\right)^{0.5} \; \left( f_{\phi}^{induced}(\phi_i)\right)^{0.5}.\]

Hence, the density after resampling is the log-pooled density of \(f_\phi^{direct}\) and \(f_\phi^{induced}\).

library(latex2exp)

nsamp <- 1e5


######################################################
# EXAMPLE 1: ALL INDEXES GIVEN SAME WEIGHT
######################################################

# h(x) = 1
# suppose we know the distribution of phi
phi_sim <- rnorm(nsamp, mean =0, sd =1)

# tibble(x = runif(nsamp, 0, 1)) %>%
#   ggplot(aes(x=x)) + geom_histogram()

 # dunif(.5, 0, 1)


# simplest example, indexes all given equal weight (uniform 0 to 1)
post_samp_ind <-sample.int(n=nsamp,
                           size=nsamp, 
                           prob=rep(1,nsamp),
                           replace=T)

tibble(pre_melding = phi_sim,
       post_melding = phi_sim[post_samp_ind]) %>%
  pivot_longer(cols = everything()) %>%
  ggplot(aes(x=value, fill = name)) +
  geom_density(alpha = .5)

# density of any point phi_i is dunif(ind_i, 0, 1) * dnorm(phi_1, 0, 1)




################################################################
# EXAMPLE 2
################################################################

# weights are proportional to random variable itself
input_rate = .2
phi_sim <- rexp(nsamp, rate = input_rate)
post_samp_ind <- sample.int(nsamp, replace=TRUE, prob=phi_sim)


dat <- tibble(pre_sampling = phi_sim,
       post_sampling = phi_sim[post_samp_ind]) %>%
  pivot_longer(cols = everything()) 

dat %>% 
  # filter(name == "post_melding") %>%
  ggplot()  +
  geom_density(aes(x=value, fill = name),
               alpha = .5) +
  stat_function(fun = dgamma, args=list(shape=2, scale=1/input_rate), color = "red")




################################################################
# EXAMPLE 3
################################################################

# weights are proportional to exp(-rate * random_variable)
input_rate = .2
phi_sim <- rexp(nsamp, rate = input_rate)
post_samp_ind <- sample.int(nsamp, replace=TRUE, prob=exp(phi_sim*-1*input_rate))


dat <- tibble(pre_sampling = phi_sim,
       post_sampling = phi_sim[post_samp_ind]) %>%
  pivot_longer(cols = everything()) 

library(latex2exp)
dat %>% 
  ggplot()  +
  geom_density(aes(x=value, fill = name),
               alpha = .5,
               color = NA) +
  stat_function(fun = dexp, 
                args=list(rate = 2*input_rate), 
                color = "red") +
  labs(title = TeX(
    paste0("Sampling from $X \\sim Exp(\\lambda) $ with Weights Proportional to ",
  "$e^{-\\lambda \\; x}$ with $\\lambda$ = ", input_rate)),
  subtitle =TeX("PDF of $Exp(2*\\lambda)$ in red"),
  fill = "") +
  theme_bw() +
  theme(plot.title = element_text(hjust = .5, size = 16),
        plot.subtitle = element_text(hjust = .5, size = 16),
        legend.text = element_text(size = 18))

################################################################
# EXAMPLE 2: INDEXES DISTRIBUTED BETA(shape1 = 2, shape2 = 2)
################################################################

phi_sim <- rnorm(nsamp, mean = 4, sd = 30)

tibble(x=phi_sim) %>%
  ggplot(aes(x=x)) +
  geom_density()

weights <- dbeta(seq(0,1, length = nsamp), shape1=12,shape2=12)

ggplot() +
  stat_function(fun = dbeta, args =list(shape1=12,shape2=12))

post_samp_ind <-sample.int(n=nsamp,
                           size=nsamp, 
                           prob=weights,
                           replace=T)

df <- tibble(x = phi_sim,
             weights = weights,
             dens = dnorm(phi_sim, mean = 4, sd = 30),
             y = weights*dens) %>%
  slice_sample(n = 5000)

dat <- tibble(pre_melding = phi_sim,
       post_melding = phi_sim[post_samp_ind]) %>%
  pivot_longer(cols = everything()) 

density(dat$post_melding, n=nsamp)


ggplot()  +
 geom_point(data=df,
 aes(x=x, y=y ),
 color = "red",
 inherit.aes=FALSE)

dat %>% 
  # filter(name == "post_melding") %>%
  ggplot()  +
  geom_point(data=df, 
            aes(x=x, y=y ),
            color = "red",
            inherit.aes=FALSE) +
  geom_density(aes(x=value, fill = name),
               alpha = .5)


# helpful
# https://stackoverflow.com/questions/59918865/what-happens-when-prob-argument-in-sample-sums-to-less-greater-than-1

Summary of the Implementation Procedure

In the implementation of Bayesian melding, we sample values of \(\theta = \{ \alpha, \beta, P(S_1|untested)\}\). Then, we use the function \(M: \theta \to \phi\) defined by \(M(\theta) = \dfrac{\beta(1- P(S_1|\text{untested}))}{\beta(1-P(S_1|untested)) + \alpha P(S_1|untested)}\), yielding the induced distribution \(f_\phi^{induced}\). Because we do not have an analytical form for \(f_\phi^{induced}\), we must estimate it using some density estimation approach. Kernel density estimation is standard for this step.

Then, we compute the sampling weights \(\left(\dfrac{f_\phi^{direct}(\phi_i)}{f_{\phi}^{induced}(\phi_i)}\right)^{0.5}\). Here, \(f_\phi^{direct}(\phi_i)\) is calculated using the density function \(f_\phi^{direct}\), which is defined based on information from meta-analyses on the asymptomatic rate, and \(f_{\phi}^{induced}(\phi_i)\) is based on kernel density estimates. Sampling with these weights from our sample of \(M(\theta) = \phi\) and our sample of \(\theta\) yields the post-melding distributions, where the post-melding density of \(\phi\) is the the logarithmic pool of \(f_\phi^{direct}\) and \(f_\phi^{induced}\), that is, \(f_\phi^{post} (\phi_i) = \left({f_\phi^{direct}(\phi_i)}\right)^{0.5} \; \left( f_{\phi}^{induced}(\phi_i)\right)^{0.5}.\)

nsamp <- 1e6

theta <- tibble(alpha = rtrunc_alpha(nsamp),
       beta= rtrunc_beta(nsamp),
       P_S_untested = rtrunc_s1_untested(nsamp))


theta <- theta %>%
  mutate(phi_induced = est_P_A_testpos(P_S_untested = P_S_untested,
                                       alpha = alpha, 
                                       beta=beta))


phi <- theta$phi_induced

phi_induced_density <- density(x = phi, n = nsamp, adjust = 2, kernel = "gaussian")

############################
# MAP IMPLEMENTATION
############################
future::plan(multisession, workers = 3)
indexes <- furrr::future_map(phi, ~{ 
  which(phi_induced_density$x > .x)[1]
  }) %>%
  unlist()

phi_sampled_density <- phi_induced_density$y[indexes]

weights3 <- (phi_sampled_density/ dp_s0_pos(phi))^(1-.5)

nsamp <- 1e6
post_samp_ind <-sample.int(n=nsamp,
                           size=nsamp, 
                           prob=phi_sampled_density,
                           replace=T)

tibble(after = phi[post_samp_ind],
       before = phi) %>%
  pivot_longer(cols= everything()) %>%
  ggplot(aes(x = value, fill = name)) +
  geom_density()



post_samp_ind <-sample.int(n=nsamp,
                           size=nsamp_post, 
                           prob=1/weights,
                           replace=T)


pi_samp <- cbind(theta[post_samp_ind,], 
                 P_A_testpos =  phi[post_samp_ind]) %>%
  select(-phi_induced)

More on Importance Sampling Resampling

Setting Pre and Post Number of Samples for Simple Example

nsamp <- 1e6

##########################################
# EXAMPLE 1
##########################################

nsamp <- 1e4


approx_target <- function(pre_nsamp, post_nsamp, input_rate) {
  
  set.seed(123)
  phi_sim <- rexp(pre_nsamp, rate = input_rate)
  # weights proportional to random variable itself
  set.seed(123)
  post_samp_ind <- sample.int(n = pre_nsamp, size = post_nsamp, replace=TRUE, prob=phi_sim)
  message(paste0("Post: ", length(post_samp_ind), ", Pre: ", length(phi_sim)))
  
  tibble(name = "Before Sampling",
         value = phi_sim) %>%
    bind_rows(
      tibble(name = "After Sampling",
             value = phi_sim[post_samp_ind])
    )
  # tibble(`Before Sampling` = phi_sim,
  #        `After Sampling` = phi_sim[post_samp_ind]) %>%
  #   pivot_longer(cols = everything()) %>%
  #   mutate(name = factor(name, 
  #                        levels = c("Before Sampling", 
  #                                   "After Sampling")))
  

}


compare_theoretical <- function(melded,
                                input_rate,
                                pre_nsamp, 
                                post_nsamp,
                                title_size = 18) {
  color_lab <-  paste0("PDF of Gamma(2, ", input_rate, ")")

  melded %>% 
    ggplot()  +
    geom_density(aes(x=value, fill = name),
                 alpha = .6,
                 color = NA) +
    labs(title = TeX(
      paste0("Sampling from $X \\sim Exp(\\lambda =  ",
             input_rate, ")$ with Weights $h(x) =x$")),
    subtitle =(paste0("PDF of Gamma(2, ",
                         input_rate,
                         ") in <span style = 'color:darkred'>Red</span>",
                      ", Sample Size: ", pre_nsamp,
                      ",<br>Posterior Sample Size:", post_nsamp)),
    fill = "",
    y = "Density") +
    theme_bw() +
    theme(axis.title = element_text(size = title_size - 2),
          plot.title = element_text(hjust = .5, size = title_size),
          plot.subtitle = element_markdown(hjust = .5, size = title_size-2),
          legend.text = element_text(size = title_size),
          strip.text=element_text(size=15, color="white"),
          strip.background = element_rect(fill="#363636")) +
    stat_function(fun = dgamma, 
                  args=list(shape=2,
                            scale=1/input_rate), 
                  aes(color = color_lab),
                  size = .9) +
    scale_color_manual(name = "", 
                       values = c("darkred")) +
    scale_fill_manual(values = c("#0C2B67", "#DE8600")) +
    guides(color = guide_legend(
      override.aes = list(size = 4)))

}

rate <- .2
# melded <- approx_target(presamp = 1e5,
#                         postsamp = 1e4, 
#                         input_rate = rate) 
# compare_theoretical(melded, input_rate = rate)

walk(c(1e4, 5e4, 1e5, 1e6),~{

  melded_df <- approx_target(pre_nsamp = .x,
                        post_nsamp = 1e4, 
                        input_rate = rate) 
  plt <- compare_theoretical(melded_df, input_rate = rate, 
                             pre_nsamp = .x,
                             post_nsamp = 1e4) +
    xlim(0,55) +
    ylim(0, .18)
  print(plt)
})

walk(c(1e4, 5e4, 1e5, 1e6),~{

  melded_df <- approx_target(pre_nsamp = .x,
                        post_nsamp = 1e4, 
                        input_rate = rate) %>%
    filter(name == "After Sampling")
  plt <- compare_theoretical(melded_df, input_rate = rate, 
                             pre_nsamp = .x,
                             post_nsamp = 1e4) +
    xlim(0,55) +
    ylim(0, .075)
  print(plt)
})

plots <- map(c(1e4, 5e4, 1e5, 1e6),~{

  melded_df <- approx_target(pre_nsamp = .x,
                        post_nsamp = 1e4, 
                        input_rate = rate) %>%
    filter(name == "After Sampling")
  plt <- compare_theoretical(melded_df, input_rate = rate, 
                             pre_nsamp = .x,
                             post_nsamp = 1e4,
                             title_size = 8) +
    xlim(0,55) +
    ylim(0, .075) })

# weights are proportional to random variable itself
input_rate = .2
phi_sim <- rexp(nsamp, rate = input_rate)
post_samp_ind <- sample.int(nsamp, replace=TRUE, prob=phi_sim)


dat <- tibble(`Before Sampling` = phi_sim,
       `After Sampling` = phi_sim[post_samp_ind]) %>%
  pivot_longer(cols = everything()) %>%
  mutate(name = factor(name, 
                       levels = c("Before Sampling", 
                                  "After Sampling")))



##########################################
# PLOT *NOT* INCLUDING THEORETICAL DENSITY
##########################################
dat %>% 
  # filter(name == "post_melding") %>%
  ggplot()  +
  geom_density(aes(x=value, fill = name),
               alpha = .6,
               color = NA) +
  labs(title = TeX(
    paste0("Sampling from $X \\sim Exp(\\lambda =  ",
           input_rate, ")$ with Weights $h(x) =x$")),
  # subtitle =TeX(paste0("PDF of Gamma$(2, \\lambda)$ in red")),
  fill = "",
  y= "Density") +
  theme_bw() +
  theme(plot.title = element_text(hjust = .5, size = 18),
        plot.subtitle = element_text(hjust = .5, size = 16),
        legend.text = element_text(size = 18)) +
  scale_fill_manual(values = c("#0C2B67", "#DE8600")) 


##########################################
# PLOT INCLUDING THEORETICAL DENSITY
##########################################
color_lab <-  paste0("PDF of Gamma(2, ", input_rate, ")")

dat %>% 
  ggplot()  +
  geom_density(aes(x=value, fill = name),
               alpha = .6,
               color = NA) +
  labs(title = TeX(
    paste0("Sampling from $X \\sim Exp(\\lambda =  ",
           input_rate, ")$ with Weights $h(x) =x$")),
  subtitle =(paste0("PDF of Gamma(2, ",
                       input_rate,
                       ") in <span style = 'color:darkred'>Red</span>")),
  fill = "",
  y = "Density") +
  theme_bw() +
  theme(plot.title = element_text(hjust = .5, size = 18),
        plot.subtitle = element_markdown(hjust = .5, size = 16),
        legend.text = element_text(size = 18)) +
  stat_function(fun = dgamma, 
                args=list(shape=2,
                          scale=1/input_rate), 
                aes(color = color_lab),
                size = 1.2) +
  scale_color_manual(name = "", 
                     values = c("darkred")) +
  scale_fill_manual(values = c("#0C2B67", "#DE8600")) +
  guides(color = guide_legend(
    override.aes = list(size = 4)))

Setting Pre and Post Number of Samples for COVID-19 Misclassification

###############################################################
# BETA PARAMETERS FROM DESIRED MEAN AND VARIANCE
###############################################################
get_beta_params <- function(mu, sd) {
    var = sd^2
    alpha <- ((1 - mu) / var - 1 / mu) * mu ^ 2
    beta <- alpha * (1 / mu - 1)
    return(params = list(alpha = alpha,
                         beta = beta))
}




###############################################################
# BETA DENSITY WITH DESIRED MEAN AND VARIANCE
###############################################################
beta_density <- function(x, mean, sd, bounds=NA) {
    shape_params <-  get_beta_params(
        mu = mean,
        sd = sd)

    if(!length(bounds) == 1){
        # message("here")
        dtrunc(x,
               spec = "beta",
               a = bounds[1],
               b = bounds[2],
              shape1 = shape_params$alpha,
              shape2 = shape_params$beta) %>%
            return()
    }else{
        dbeta(x,
          shape1 = shape_params$alpha,
          shape2 = shape_params$beta)  %>%
            return()
        }
}




###############################################################
# SAMPLE FROM BETA DENSITY WITH DESIRED MEAN AND VARIANCE
###############################################################

sample_beta_density <- function(n, mean, sd, bounds = NA) {

    shape_params <-  get_beta_params(
        mu = mean,
        sd = sd)

    rbeta(n,
          shape1 = shape_params$alpha,
          shape2 = shape_params$beta)

    if(!length(bounds) == 1){
        # message("here")
        rtrunc(n,
               spec = "beta",
               a = bounds[1],
               b = bounds[2],
               shape1 = shape_params$alpha,
               shape2 = shape_params$beta) %>%
            return()
    }else{
        rbeta(n,
              shape1 = shape_params$alpha,
              shape2 = shape_params$beta)  %>%
            return()
    }
}




###############################################################
# GAMMA PARAMETERS FROM DESIRED MEAN AND VARIANCE
###############################################################
get_gamma_params <- function(mu, sd) {
    var = (mu/sd)^2
    shape = (mu/sd)^2
    scale = sd^2/mu
    return(params = list(shape = shape,
                         scale = scale))
}


###############################################################
# GAMMA DENSITY WITH DESIRED MEAN AND VARIANCE
###############################################################
gamma_density <- function(x, mean, sd, bounds=NA) {

    shape_params <-  get_gamma_params(
        mu = mean,
        sd = sd)

    if(!length(bounds) == 1){
        #message("here")
        dtrunc(x,
               spec = "gamma",
               a = bounds[1],
               b = bounds[2],
               shape = shape_params$shape,
               scale = shape_params$scale) %>%
            return()
    }else{
        dgamma(x,
               shape = shape_params$shape,
               scale = shape_params$scale) %>%
            return()
    }
}


sample_gamma_density <- function(n, mean, sd, bounds = NA) {

    shape_params <-  get_gamma_params(
        mu = mean,
        sd = sd)

    if(!length(bounds) == 1){
        #message("here")
        rtrunc(n,
               spec = "gamma",
               a = bounds[1],
               b = bounds[2],
               shape = shape_params$shape,
               scale = shape_params$scale) %>%
            return()
    }else{
        rgamma(n,
               shape = shape_params$shape,
               scale = shape_params$scale) %>%
            return()
    }
}





###############################################################
# INDUCED PRIOR ON ASYMPTOMATIC RATE  P(S_0|test+,untested)
###############################################################

# input sampled values of theta and compute M(\theta)
est_P_A_testpos = function(P_S_untested, alpha, beta){
    (beta * (1 - P_S_untested)) / (( beta * (1 - P_S_untested)) + (alpha * P_S_untested))
}
# set prior parameters
prior_params <- list(
  alpha_mean = .95,
  alpha_sd = 0.08,
  alpha_bounds = NA,
 # alpha_bounds = c(.8,1),
  beta_mean = .15,
  beta_sd =.09,
  beta_bounds = NA,
#  beta_bounds = c(0.002, 0.4),
  s_untested_mean = .03,
  s_untested_sd = .0225,
#  s_untested_bounds = c(0.0018, Inf),
  s_untested_bounds = NA,
  p_s0_pos_mean = .4,
  p_s0_pos_sd = .1225,
 p_s0_pos_bounds = NA,
#  p_s0_pos_bounds = c(.25, .7),
  pre_nsamp = 1e6,
  post_nsamp = 1e5)
# needed for plot_melded
library(patchwork)

get_melded <- function(alpha_mean = 0.9,
                       alpha_sd = 0.04,
                       alpha_bounds = NA,
                       beta_mean = .15,
                       beta_sd =.09,
                       beta_bounds = NA,
                       s_untested_mean = .025,
                       s_untested_sd = .0225,
                       s_untested_bounds = NA,
                       p_s0_pos_mean = .4,
                       p_s0_pos_sd = .1225,
                       p_s0_pos_bounds = NA,
                       pre_nsamp = 1e6,
                       post_nsamp = 1e5) {

  given_args <- as.list(environment())
   cat("Arguments to get_melded:\n")
   print(given_args)

    theta <- tibble(alpha = sample_gamma_density(pre_nsamp,
                                                mean = alpha_mean,
                                                sd = alpha_sd,
                                                bounds = alpha_bounds),
                    beta= sample_beta_density(pre_nsamp,
                                              mean = beta_mean,
                                              sd = beta_sd,
                                              bounds = beta_bounds),
                    P_S_untested = sample_beta_density(pre_nsamp,
                                                       mean = s_untested_mean,
                                                       sd = s_untested_sd,
                                                       bounds = s_untested_bounds)) %>%
        mutate(phi_induced = est_P_A_testpos(P_S_untested = P_S_untested,
                                             alpha = alpha,
                                             beta=beta))
    
   # message(paste0("nrows of theta: ", nrow(theta)))

    # theta contains values sampled from alpha, beta, P_S_untested, and M(theta) = phi_induced
    # induced phi
    phi <- theta$phi_induced

    # approximate induced distribution via a density approximation
    phi_induced_density <- density(x = phi, n = pre_nsamp, adjust = 2, kernel = "gaussian")

    indexes <- findInterval(phi, phi_induced_density$x)

    phi_sampled_density <- phi_induced_density$y[indexes]

    dp_s0_pos <- function(x) {

      beta_density(x,
                   mean=p_s0_pos_mean,
                   sd = p_s0_pos_sd,
                   bounds=p_s0_pos_bounds)
    }

    # weights <- (phi_sampled_density/ dp_s0_pos(phi))^(.5)
    weights <- (dp_s0_pos(phi)/phi_sampled_density)^(0.5)

    post_samp_ind <-sample.int(n=pre_nsamp,
                               size=post_nsamp,
                              # prob=1/weights,
                              prob = weights,
                               replace=TRUE)

    post_melding <- bind_cols(theta[post_samp_ind,],
                     P_A_testpos =  phi[post_samp_ind]) %>%
        select(-phi_induced)

     return(list(post_melding = post_melding, pre_melding = theta))
  #  return(post_melding)
}



#' reformat for plot generation
reformat_melded <- function(melded_df,
                            theta_df,
                            pre_nsamp,
                            p_s0_pos_mean,
                            p_s0_pos_sd,
                            p_s0_pos_bounds) {

  melded_df_long <- melded_df %>%
    pivot_longer(cols=everything()) %>%
    mutate(type = "After Melding")


  melded <- theta_df %>%
    mutate(P_A_testpos = sample_beta_density(pre_nsamp,
                                             mean = p_s0_pos_mean,
                                             sd = p_s0_pos_sd,
                                             bounds = p_s0_pos_bounds)) %>%
    pivot_longer(cols=everything()) %>%
    mutate(type = ifelse(
      name == "phi_induced",
      "Induced", "Before Melding")) %>%
    mutate(name = ifelse(name == "phi_induced",
                         "P_A_testpos",
                         name)) %>%
    bind_rows(melded_df_long) %>%
    mutate(name = case_when(
      name == "alpha" ~"$\\alpha$",
      name == "beta" ~"$\\beta$",
      name == "P_A_testpos" ~ "$P(S_0|test+,untested)$",
      name == "P_S_untested" ~ "$P(S_1|untested)$")
    ) %>%
    mutate(name = factor(name,
                         levels = c(
                           "$\\alpha$",
                           "$\\beta$",
                           "$P(S_1|untested)$",
                           "$P(S_0|test+,untested)$")))

}


plot_melded <- function(melded, custom_title="", pre_nsamp, post_nsamp) {
  
  
  p1 <- melded %>%
    filter(name != "$P(S_0|test+,untested)$") %>%
    ggplot(aes(x = value, fill = type)) +
    geom_density(alpha = .5, show.legend=FALSE) +
    facet_wrap(~name,
               labeller = as_labeller(
                 TeX,   default = label_parsed),
               ncol = 3,
               scales = "fixed") +
    theme_bw() +
    theme(
          # axis.text.y = element_blank(),
          # axis.ticks.y = element_blank(),
          axis.title = element_text(size = 18),
          axis.text.x = element_text(size = 10),
          plot.title =element_text(size = 18,
                                   margin =margin(0,0, .5,0, 'cm')),
          strip.text = element_text(size = 16, color="white"),
          strip.background=element_rect(fill ="#363636"),
          legend.text = element_text(size = 16),
          plot.subtitle = element_text(face = "bold", size = 18)) +
    labs(title = TeX(custom_title,bold=TRUE),
         subtitle =paste0("Sample Size: ", pre_nsamp,
                          "\nPosterior Sample Size:", post_nsamp),
         fill = "",
         y = "Density") +
    scale_fill_manual(values = c("#5670BF", "#418F6A","#B28542")) +
    guides(fill = guide_legend(keyheight = 2,  keywidth = 2))
  
  p2 <- melded %>%
    filter(name == "$P(S_0|test+,untested)$") %>%
    ggplot(aes(x = value, fill = type)) +
    geom_density(alpha = .5) +
    facet_wrap(~name,
               labeller = as_labeller(
                 TeX,   default = label_parsed),
               ncol = 3,
               scales = "fixed") +
    theme_bw() +
    theme(
          # axis.text.y = element_blank(),
          # axis.ticks.y = element_blank(),
          axis.title = element_text(size = 18),
          axis.text.x = element_text(size = 10),
          plot.title =element_text(size = 18,
                                   margin =margin(0,0, .5,0, 'cm')),
          strip.text = element_text(size = 16, color="white"),
          strip.background=element_rect(fill ="#363636"),
          legend.text = element_text(size = 18)) +
    labs(
      #title = paste0("Number of Samples: ", nsamp),
         fill = "",
         y = "Density") +
    scale_fill_manual(values = c("#5670BF", "#418F6A","#B28542")) +
    guides(fill = guide_legend(keyheight = 2,  keywidth = 2)) +
    xlim(0,1)
  
  
  p1 / p2 +  plot_layout(nrow =2, widths = c(4,1))
  
}
params <- prior_params

melded <- do.call(get_melded, params)
       

melded_long <- reformat_melded(melded_df =  melded$post_melding,
                               theta_df =  melded$pre_melding,
                               p_s0_pos_mean = params$p_s0_pos_mean,
                               p_s0_pos_sd = params$p_s0_pos_sd,
                               p_s0_pos_bounds = params$p_s0_pos_bounds,
                               pre_nsamp = params$pre_nsamp)

plot_melded(melded_long, pre_nsamp = params$pre_nsamp, post_nsamp = params$post_nsamp)

# set prior parameters
prior_params <- list(
  alpha_mean = .95,
  alpha_sd = 0.08,
  alpha_bounds = NA,
 # alpha_bounds = c(.8,1),
  beta_mean = .15,
  beta_sd =.09,
  beta_bounds = NA,
#  beta_bounds = c(0.002, 0.4),
  s_untested_mean = .03,
  s_untested_sd = .0225,
#  s_untested_bounds = c(0.0018, Inf),
  s_untested_bounds = NA,
  p_s0_pos_mean = .4,
  p_s0_pos_sd = .1225,
# p_s0_pos_bounds = NA,
  p_s0_pos_bounds = c(.25, .7),
  pre_nsamp = 1e4,
  post_nsamp = 1e4)


walk(c(1e4, 1e5, 1e6), ~{
  params <- prior_params
  params$pre_nsamp <- .x
  melded <- do.call(get_melded, params)
  melded_long <- reformat_melded(melded_df =  melded$post_melding,
                                 theta_df =  melded$pre_melding,
                                 p_s0_pos_mean = params$p_s0_pos_mean,
                                 p_s0_pos_sd = params$p_s0_pos_sd,
                                 p_s0_pos_bounds = params$p_s0_pos_bounds,
                                 pre_nsamp = params$pre_nsamp)
  
  plt <- plot_melded(melded_long, pre_nsamp = params$pre_nsamp, post_nsamp = params$post_nsamp)
  print(plt)
})

References

Carvalho, Luiz M., Daniel A. M. Villela, Flavio C. Coelho, and Leonardo S. Bastos. 2023. “Bayesian Inference for the Weights in Logarithmic Pooling.” Bayesian Analysis 18 (1). https://doi.org/10.1214/22-BA1311.
Genest, Christian, Kevin J. McConway, and Mark J. Schervish. 1986. “Characterization of Externally Bayesian Pooling Operators.” The Annals of Statistics 14 (2): 487–501. https://www.jstor.org/stable/2241231.
Ma, Qiuyue, Jue Liu, Qiao Liu, Liangyu Kang, Runqing Liu, Wenzhan Jing, Yu Wu, and Min Liu. 2021. “Global Percentage of Asymptomatic SARS-CoV-2 Infections Among the Tested Population and Individuals With Confirmed COVID-19 Diagnosis: A Systematic Review and Meta-analysis.” JAMA Network Open 4 (12): e2137257. https://doi.org/10.1001/jamanetworkopen.2021.37257.
Poole, David, and Adrian E. Raftery. 2000. “Inference for Deterministic Simulation Models: The Bayesian Melding Approach.” Journal of the American Statistical Association 95 (452): 1244–55. https://doi.org/10.1080/01621459.2000.10474324.
Powers, Kimberly A, Azra C Ghani, William C Miller, Irving F Hoffman, Audrey E Pettifor, Gift Kamanga, Francis EA Martinson, and Myron S Cohen. 2011. “The Role of Acute and Early HIV Infection in the Spread of HIV and Implications for Transmission Prevention Strategies in Lilongwe, Malawi: A Modelling Study.” The Lancet 378 (9787): 256–68. https://doi.org/10.1016/S0140-6736(11)60842-8.
Robson, Barbara J. 2014. “When Do Aquatic Systems Models Provide Useful Predictions, What Is Changing, and What Is Next?” Environmental Modelling & Software 61: 287–96. https://doi.org/10.1016/j.envsoft.2014.01.009.
Sah, Pratha, Meagan C. Fitzpatrick, Charlotte F. Zimmer, Elaheh Abdollahi, Lyndon Juden-Kelly, Seyed M. Moghadas, Burton H. Singer, and Alison P. Galvani. 2021. “Asymptomatic SARS-CoV-2 Infection: A Systematic Review and Meta-Analysis.” Proceedings of the National Academy of Sciences 118 (34): e2109229118. https://doi.org/10.1073/pnas.2109229118.
Ševčíková, Hana, Adrian E. Raftery, and Paul A. Waddell. 2007. “Assessing Uncertainty in Urban Simulations Using Bayesian Melding.” Transportation Research Part B: Methodological 41 (6): 652–69. https://doi.org/10.1016/j.trb.2006.11.001.

  1. A formulation of this problem is here.↩︎